Maximum Entropy Weighting of Aligned Sequences of Proteins or DNA

Authors

  • Anders Krogh
  • Graeme J. Mitchison
Abstract

In a family of proteins or other biological sequences like DNA the various subfamilies are often very unevenly represented. For this reason a scheme for assigning weights to each sequence can greatly improve performance at tasks such as database searching with profiles or other consensus models based on multiple alignments. A new weighting scheme for this type of database search is proposed. In a statistical description of the searching problem it is derived from the maximum entropy principle. It can be proved that, in a certain sense, it corrects for uneven representation. It is shown that finding the maximum entropy weights is an easy optimization problem for which standard techniques are applicable.
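The abstract does not state the objective function, so the following is only a minimal sketch of one possible reading: choose non-negative sequence weights that sum to 1 and maximize the summed Shannon entropy of the weighted residue frequencies in each alignment column. The function names, the exponentiated-gradient update, the step size, and the treatment of gap characters as ordinary symbols are all assumptions made for illustration, not details taken from the paper.

```python
import math

def weighted_column_freqs(weights, column):
    """Weighted frequency of each symbol in one alignment column."""
    freqs = {}
    for w, residue in zip(weights, column):
        freqs[residue] = freqs.get(residue, 0.0) + w
    return freqs

def total_entropy(weights, alignment):
    """Sum over columns of the Shannon entropy of the weighted frequencies."""
    total = 0.0
    for column in zip(*alignment):
        for p in weighted_column_freqs(weights, column).values():
            if p > 0:
                total -= p * math.log(p)
    return total

def max_entropy_weights(alignment, steps=500, lr=0.5):
    """Exponentiated-gradient ascent on the probability simplex (assumed solver)."""
    n_seqs, n_cols = len(alignment), len(alignment[0])
    w = [1.0 / n_seqs] * n_seqs  # start from uniform weights
    for _ in range(steps):
        grad = [0.0] * n_seqs
        for column in zip(*alignment):
            freqs = weighted_column_freqs(w, column)
            for k, residue in enumerate(column):
                # d/dw_k of -sum_x p(x) log p(x), averaged over columns
                grad[k] -= (math.log(freqs[residue]) + 1.0) / n_cols
        top = max(grad)  # shift for numerical stability; cancels on renormalizing
        w = [wk * math.exp(lr * (g - top)) for wk, g in zip(w, grad)]
        norm = sum(w)
        w = [wk / norm for wk in w]
    return w

# Toy alignment: one subfamily appears three times, the other once.
alignment = ["AAAA", "AAAA", "AAAA", "CCCC"]
weights = max_entropy_weights(alignment)
print(weights, total_entropy(weights, alignment))
```

In this toy case the three identical sequences end up sharing half of the total weight and the lone sequence receives the other half, which is the kind of correction for uneven representation the abstract describes. Because entropy is concave in the weights, other standard convex optimizers could be substituted for the update used here.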


Similar resources

Maximum Entropy Weighting of Aligned Sequences of Proteins or

In a family of proteins or other biological sequences like DNA the various subfamilies are often very unevenly represented. For this reason a scheme for assigning weights to each sequence can greatly improve performance at tasks such as database searching with profiles or other consensus models based on multiple alignments. A new weighting scheme for this type of database search is proposed. In ...


(Short article) Phylogenetic analysis and molecular evolution of leptin

In the current study, the phylogeny and molecular evolution of mammalian Leptin were investigated. Data were obtained by searching the genome database and then aligned; all examined mammals contained only a single copy of Leptin. The nucleotide substitution rate of the sequences and the molecular evolution of Leptin were calculated by maximum likelihood and neighbor-joinin...


gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences

Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...


Modeling residue usage in aligned protein sequences via maximum likelihood.

A computational method is presented for characterizing residue usage, i.e., site-specific residue frequencies, in aligned protein sequences. The method obtains frequency estimates that maximize the likelihood of the sequences in a simple model for sequence evolution, given a tree or a set of candidate trees computed by other methods. These maximum-likelihood frequencies constitute a profile of ...


The roles of EPIYA sequence to perturb the cellular signaling pathways and cancer risk

It was shown that several pathogenic bacterial effector proteins contain the Glu-Pro-Ile-Tyr-Ala (EPIYA) or a similar sequence. These bacterial EPIYA effectors are delivered into host cells via type III or IV secretion systems, where they undergo tyrosine phosphorylation at the EPIYA sequences, which triggers interaction with multiple host cell SH2 domain-containing proteins and thereby...



Journal title:
  • Proceedings. International Conference on Intelligent Systems for Molecular Biology

Volume 3


Publication date: 1995